Abstract. V-order is a global order on strings related to Unique Maximal
Factorization Families (UMFFs) [6,7], which are themselves generalizations
of Lyndon words [14]. V-order has recently been proposed as an alternative
to lexicographical order in the computation of suffix arrays and in the
suffix-sorting induced by the Burrows-Wheeler transform. Efficient
V-ordering of strings thus becomes a matter of considerable interest. In
this paper we present new and surprising results on V-order in strings,
then go on to explore the algorithmic consequences.


1 Introduction 

This paper extends current knowledge on the non-lexicographic string
ordering technique known as V-order [5]. New combinatorial insights are
obtained which are linked to computational settings. In particular, we
relate V-order string comparison to lexicographic comparison by showing
how it is possible to traverse the strings from left to right, respectively
right to left, at each stage determining in $O(1)$ time the order of
prefixes, respectively suffixes. This improves on existing ordering
algorithms [?] in various ways: it removes any dependence on an "indexed"
alphabet, it orders prefixes and suffixes in addition to the original
strings, and it reduces dependence on additional data structures.
Furthermore, we introduce an input-sensitive variant for V-order
comparison.

Regarding practical applications of V-order, in [9] a novel variant of the
classic lexicographic Burrows-Wheeler transform, the V-transform (V-BWT),
was introduced which was based on V-order; instances of enhanced data
clustering were demonstrated. Linear V-sorting of all the rotations of a
string $x = x[1..n]$, as required for an efficient transform, was achieved
by linear time and space V-order string comparison (Daykin et al. 2011) [7]
along with $\Theta(n)$ suffix-sorting (Ko and Aluru, 2003) [13].
Lyndon-like factorization of a string into V-words is likewise linear in
time and space [7]. For V-words, [9] showed how to compute the V-transform
in $O(n)$ time and space; in addition, inverting the V-transform to recover
the input V-word was achieved in time $O(n^2 \log k')$, using $O(n + k')$
additional storage, where $k'$ is the number of sequences of largest
letters in $x$. A bijective algorithm was also outlined in the case that
$x$ is arbitrary.

We apply the new combinatorial insights gained to modify ideas given in
[15] for Lyndon factorizations, suffix arrays and the Burrows-Wheeler
transform, to similarly obtain on-line processing for V-order.

2 Preliminaries 

Consider a finite totally ordered alphabet $\Sigma$ which consists of a set
of characters (equivalently letters or symbols) with cardinality
$|\Sigma|$. A string is a sequence of zero or more characters over
$\Sigma$. A string $s$ of length $|s| = n$ is represented by $s[1..n]$,
where $s[i] \in \Sigma$ for $1 \le i \le n$. The set of all non-empty
strings over the alphabet $\Sigma$ is denoted by $\Sigma^+$. The empty
string with zero length is denoted by $\varepsilon$, with
$\Sigma^* = \Sigma^+ \cup \{\varepsilon\}$. A string $w$ is a substring, or
factor, of $s$ if $s = uwv$, where $u, v \in \Sigma^*$. Words $w[1..i]$ are
prefixes of $w$, and words $w[i..n]$ are suffixes of $w$. For further
stringological definitions, theory and algorithmics see [3].

Some of our applications are derived from Lyndon words, which we now
introduce. A string $y = y[1..n]$ is a conjugate (or cyclic rotation) of
$x = x[1..n]$ if $y[1..n] = x[i..n]\,x[1..i-1]$ for some $1 \le i \le n$
(for $i = 1$, $y = x$). A Lyndon word is a primitive word which is minimal
for the lexicographical order (lexorder) of its conjugacy class.

Theorem 1. Any word $w$ can be written uniquely as a non-increasing
product $w = u_1 u_2 \cdots u_k$ of Lyndon words.

Theorem 1 shows that there is a unique decomposition of any word into
non-increasing Lyndon words ($u_1 \ge u_2 \ge \cdots \ge u_k$). We proceed
to define a non-lexicographic order, V-order, and then establish useful new
lexicographic characteristics for V-order.

Let $x = x_1 x_2 \cdots x_n$ be a string over $\Sigma$. Define
$h \in \{1, \ldots, n\}$ by $h = 1$ if $x_1 \le x_2 \le \cdots \le x_n$;
otherwise, by the unique value such that
$x_{h-1} > x_h \le x_{h+1} \le x_{h+2} \le \cdots \le x_n$. Let
$x^* = x_1 x_2 \cdots x_{h-1} x_{h+1} \cdots x_n$, where the star $*$
indicates deletion of the letter $x_h$. Write $x^{s*}$ for
$(\ldots(x^*)^*\ldots)^*$ with $s \ge 0$ stars. Let
$g = \max\{x_1, x_2, \ldots, x_n\}$, and let $k$ be the number of
occurrences of $g$ in $x$. Then the sequence $x, x^*, x^{2*}, \ldots$ ends
with $g^k, \ldots, g^2, g^1, g^0 = \varepsilon$. In the star tree each
string $x$ over $\Sigma$ labels a vertex, and there is a directed edge from
$x$ to $x^*$, with the empty string $\varepsilon$ as the root.

Definition 1. We define V-order $\prec$ between distinct strings $x, y$ as
follows. First $x \prec y$ if $x$ is in the path
$y, y^*, y^{2*}, \ldots, \varepsilon$. If $x, y$ are not in a path, there
exist smallest $s, t$ such that $x^{(s+1)*} = y^{(t+1)*}$. Put
$\mathbf{s} = x^{s*}$ and $\mathbf{t} = y^{t*}$; then
$\mathbf{s} \ne \mathbf{t}$ but $|\mathbf{s}| = |\mathbf{t}| = m$ say. Let
$j \in 1..m$ be the greatest integer such that
$\mathbf{s}[j] \ne \mathbf{t}[j]$. If $\mathbf{s}[j] < \mathbf{t}[j]$ in
$\Sigma$ then $x \prec y$. Clearly $\prec$ is a total order.

For instance, using the natural ordering of integers, if $x = 32415$, then
$x^* = 3245$, $x^{2*} = 345$, $x^{3*} = 45$ and so $45 \prec 32415$.
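The star operation and Definition 1 can be traced mechanically. The following is a minimal Python sketch, assuming strings are Python `str` values whose letters are ordered by code point; the function names `star`, `star_path` and `v_less_by_definition` are hypothetical and chosen only for this illustration. It follows the definition literally and makes no attempt at efficiency.

```python
def star(x):
    """Apply one star: delete the letter x_h of a non-empty string x."""
    h = len(x) - 1
    # Walk left while the suffix starting at position h stays non-decreasing.
    while h > 0 and x[h - 1] <= x[h]:
        h -= 1
    return x[:h] + x[h + 1:]


def star_path(x):
    """Return the path [x, x*, x^{2*}, ..., ''] from x to the root."""
    path = [x]
    while path[-1]:
        path.append(star(path[-1]))
    return path


def v_less_by_definition(x, y):
    """True iff x precedes y in V-order, following Definition 1 literally."""
    if x == y:
        return False
    px, py = star_path(x), star_path(y)
    if x in py:
        return True
    if y in px:
        return False
    # Reverse the paths so that index m holds the ancestor of length m,
    # then locate the first length at which the two paths differ.
    rx, ry = px[::-1], py[::-1]
    m = next(i for i in range(min(len(rx), len(ry))) if rx[i] != ry[i])
    s, t = rx[m], ry[m]
    j = max(i for i in range(m) if s[i] != t[i])  # greatest differing position
    return s[j] < t[j]


print(star_path("32415"))
print(v_less_by_definition("45", "32415"))
```

Each call to `star_path` performs a linear scan per star, so this sketch is quadratic in the string length; it is intended only for checking small examples against the definition.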
Definition 2. [?] The V-form of a string $x$ is defined as

$$V_k(x) = x = x_0 g x_1 g \cdots x_{k-1} g x_k$$

for strings $x_i$, $i = 0, 1, \ldots, k$, where $g$ is the largest letter
in $x$; thus we suppose that $g$ occurs exactly $k$ times. For clarity,
when more than one string is involved, we use the notation
$g = \mathcal{L}_x$, $k = \mathcal{C}_x$.

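Lemma 1. [5], [?] Suppose we are given distinct strings $x$ and $y$ with
corresponding V-forms as follows:

$$x = x_0 \mathcal{L}_x x_1 \mathcal{L}_x x_2 \cdots x_{j-1} \mathcal{L}_x x_j,$$

$$y = y_0 \mathcal{L}_y y_1 \mathcal{L}_y y_2 \cdots y_{k-1} \mathcal{L}_y y_k,$$

where $j = \mathcal{C}_x$, $k = \mathcal{C}_y$.

Let $h \in \{0 \ldots \max(j, k)\}$ be the least integer such that
$x_h \ne y_h$. Then $x \prec y$ if, and only if, one of the following
conditions holds:

(C1) $\mathcal{L}_x < \mathcal{L}_y$

(C2) $\mathcal{L}_x = \mathcal{L}_y$ and $\mathcal{C}_x < \mathcal{C}_y$

(C3) $\mathcal{L}_x = \mathcal{L}_y$, $\mathcal{C}_x = \mathcal{C}_y$ and $x_h \prec y_h$.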

Lemma 2. [?], [7] For given strings $v$ and $x$, if $v$ is a proper
subsequence of $x$, then $v \prec x$.

Example 1. We compare two dictionaries for a set of English words over the
ordered Roman alphabet.

Lexorder ($<$) dictionary: catastrophe $<$ sop $<$ strop $<$ strophe $<$ top.
The well-known lexorder positional technique seeks the first difference
from the left and then applies the ordering of the alphabet.

V-order ($\prec$) dictionary: sop $\prec$ top $\prec$ strop $\prec$ strophe $\prec$ catastrophe.
The first V-order comparison is determined by Lemma 1 (C1) and the
following three by the useful Lemma 2.
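Example 1 can be reproduced with a short recursive comparison built directly on the three conditions of Lemma 1. This is a minimal Python sketch rather than any of the linear-time algorithms cited in the text; the names `v_form`, `v_less` and `v_cmp` are hypothetical, and letters are assumed to be ordered by code point.

```python
from functools import cmp_to_key


def v_form(x):
    """Return (g, [x_0, ..., x_k]) for a non-empty x: its largest letter g
    and the substrings between the k occurrences of g (Definition 2)."""
    g = max(x)
    return g, x.split(g)


def v_less(x, y):
    """True iff x precedes y in V-order, using conditions (C1)-(C3)."""
    if x == y:
        return False
    if not x or not y:
        return not x                      # the empty string precedes all others
    gx, px = v_form(x)
    gy, py = v_form(y)
    if gx != gy:                          # (C1): compare largest letters
        return gx < gy
    if len(px) != len(py):                # (C2): compare their multiplicities
        return len(px) < len(py)
    # (C3): recurse on the first pair of differing substrings x_h, y_h.
    a, b = next((a, b) for a, b in zip(px, py) if a != b)
    return v_less(a, b)


def v_cmp(x, y):
    """Three-way comparison wrapper suitable for functools.cmp_to_key."""
    return -1 if v_less(x, y) else (1 if v_less(y, x) else 0)


words = ["catastrophe", "sop", "strop", "strophe", "top"]
print(sorted(words))                         # lexorder dictionary
print(sorted(words, key=cmp_to_key(v_cmp)))  # V-order dictionary
```

The recursion terminates because each substring $x_h$ has a strictly smaller largest letter than $x$.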
3 New Results on V-Order

A main interest of this paper is to consider positional lexorder-type
ordering techniques for V-order, for which we first establish some basics.
Given an ordered alphabet $\Sigma = \{1 < 2 < \cdots\}$ and a string
$x \in \Sigma^+$ with $|x| > 1$, then from conditions (C1, C2) we have, as
for lexorder, $1 \prec x \prec x^i$ for all $i > 1$. For strings
$u, v, w \in \Sigma^+$ with $u \prec v \prec w$, we find by Lemma 2 that,
again as for lexorder, both $u \prec uv$ and $vw \nprec w$ (in contrast to
Lyndon words). In general, for $i, j \ge 1$, we can say that

$$1 \preceq u \prec u^2 \prec \cdots \prec u^i \prec u^i v \prec \cdots \prec u^i v^j \prec \cdots \prec u^i v^j w \prec \cdots$$

We begin by generalizing Lemma 2.5 in [9]:

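Lemma 3. For any two strings $x, y$ and $\lambda \in \Sigma$,
$x \prec y \iff x\lambda \prec y\lambda$.

Proof. Let $x' = x\lambda$, $y' = y\lambda$. First observe that if
$\mathcal{L}_x < \mathcal{L}_y$, then by (C1), $x \prec y$. Furthermore:

• if $\lambda < \mathcal{L}_y$, then $x' \prec y'$ by (C1), because
$\mathcal{L}_{x'} < \mathcal{L}_y = \mathcal{L}_{y'}$;

• if $\lambda = \mathcal{L}_y$, then $x' \prec y'$ by (C2), because
$\mathcal{L}_{x'} = \mathcal{L}_{y'} = \lambda$ and
$\mathcal{C}_{x'} = 1 < \mathcal{C}_{y'}$;

• if $\lambda > \mathcal{L}_y$, then $x' \prec y'$ by (C3), because
$\mathcal{L}_{x'} = \mathcal{L}_{y'} = \lambda$,
$\mathcal{C}_{x'} = \mathcal{C}_{y'} = 1$, and $x \prec y$.

Thus the lemma holds for $\mathcal{L}_x < \mathcal{L}_y$ and, by the
complementary argument, it holds also for $\mathcal{L}_y < \mathcal{L}_x$.
We may assume therefore that $\mathcal{L}_x = \mathcal{L}_y$. Suppose then
that $\mathcal{C}_x < \mathcal{C}_y$, so that by (C2), $x \prec y$.
Furthermore:

• if $\lambda \le \mathcal{L}_x = \mathcal{L}_y$, then $x' \prec y'$ by
(C2), because
$\mathcal{C}_{x'} = \mathcal{C}_x + \delta < \mathcal{C}_y + \delta = \mathcal{C}_{y'}$,
where $\delta = 0$ ($\lambda < \mathcal{L}_x$) or $1$
($\lambda = \mathcal{L}_x$);

• if $\lambda > \mathcal{L}_x$, then $x' \prec y'$ by (C3), because
$\mathcal{L}_{x'} = \mathcal{L}_{y'} = \lambda$,
$\mathcal{C}_{x'} = \mathcal{C}_{y'} = 1$, and $x \prec y$.

Thus the lemma holds for $\mathcal{C}_x < \mathcal{C}_y$, and as above also
for $\mathcal{C}_y < \mathcal{C}_x$.
Suppose therefore that $\mathcal{L}_x = \mathcal{L}_y$,
$\mathcal{C}_x = \mathcal{C}_y$. Then whether or not $x \prec y$ depends on
the least value $h$ of Lemma 1 such that $x_h \prec y_h$ or
$y_h \prec x_h$.

• If $\lambda = \mathcal{L}_x = \mathcal{L}_y$, then $h$ is unchanged by
appending $\lambda$ to $x$ and to $y$, so that, in this case,
$x \prec y \iff x' \prec y'$, as required.

• For $\lambda > \mathcal{L}_x$, we find as above that
$\mathcal{L}_{x'} = \mathcal{L}_{y'} = \lambda$,
$\mathcal{C}_{x'} = \mathcal{C}_{y'} = 1$, so that the ordering of $x'$ and
$y'$ is equivalent to the ordering of $x$ and $y$.

• Finally, suppose that $\lambda < \mathcal{L}_x = \mathcal{L}_y$. If
$h < \mathcal{C}_x$, then as above the ordering of $x', y'$ corresponds to
the ordering of $x, y$, unaffected by appending $\lambda$. If on the other
hand $h = \mathcal{C}_x$, then the problem reduces recursively to ordering
$x_h\lambda, y_h\lambda$ based on the ordering of $x_h, y_h$, where
$\mathcal{L}_{x_h} < \mathcal{L}_x$ and $\mathcal{L}_{y_h} < \mathcal{L}_y$.
Thus, after a finite number of such reductions, one of the above cases must
hold.

This completes the proof. □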


Lemma 4. For any two strings $x, y$ and $\lambda \in \Sigma$,
$x \prec y \iff \lambda x \prec \lambda y$.

Proof. The argument is analogous to that given for Lemma 3. Note that the
recursive case $\lambda x_0, \lambda y_0$ is likewise based on the ordering
of $x_0, y_0$, where $\mathcal{L}_{x_0} < \mathcal{L}_x$ and
$\mathcal{L}_{y_0} < \mathcal{L}_y$. □

Interestingly, although Lemma 4 holds for lexorder, Lemma 3 does not, as
shown by: $a < ab$ in lexorder but $ac \nless abc$.

We can now combine the above lemmas into a more general result: 

Theorem 2. For any strings $u, v, x, y$, $x \prec y \iff uxv \prec uyv$.

Proof. This follows from repeated applications of Lemmas 3 & 4, where we
append one letter at a time to suffixes and prepend one letter at a time to
prefixes. □
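As a small sanity check of Theorem 2, the following worked instance applies Lemma 1 to a pair chosen purely for illustration (it does not appear in the source): $u = v = 2$, $x = 21$, $y = 12$ over the integer alphabet. Dots separate the substrings of each V-form from the occurrences of the largest letter.

```latex
\begin{align*}
x = 21 &= \varepsilon \cdot 2 \cdot 1, &
y = 12 &= 1 \cdot 2 \cdot \varepsilon, &
h = 0:\ \varepsilon \prec 1
  &\ \Longrightarrow\ x \prec y \ \text{by (C3)}, \\
uxv = 2212 &= \varepsilon \cdot 2 \cdot \varepsilon \cdot 2 \cdot 1 \cdot 2 \cdot \varepsilon, &
uyv = 2122 &= \varepsilon \cdot 2 \cdot 1 \cdot 2 \cdot \varepsilon \cdot 2 \cdot \varepsilon, &
h = 1:\ \varepsilon \prec 1
  &\ \Longrightarrow\ uxv \prec uyv \ \text{by (C3)}.
\end{align*}
```

In both lines the deciding comparison is $\varepsilon \prec 1$, which holds by Lemma 2; surrounding $x$ and $y$ by the same context moves the position $h$ of the first differing substring but not the outcome.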

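We can establish extensions and applications of these results:

Lemma 5. Let $x$ and $y$ be strings with V-forms

$$x = x_0 \mathcal{L}_x x_1 \mathcal{L}_x x_2 \cdots x_{j-1} \mathcal{L}_x x_j,$$

$$y = y_0 \mathcal{L}_y y_1 \mathcal{L}_y y_2 \cdots y_{k-1} \mathcal{L}_y y_k.$$

For any letter $\lambda \le \max(\mathcal{L}_x, \mathcal{L}_y)$ and any
integer $i \in \{0 \ldots \max(j, k)\}$, let

$$x' = x_0 \mathcal{L}_x \cdots \mathcal{L}_x x_i \lambda \mathcal{L}_x \cdots \mathcal{L}_x x_j,$$
$$y' = y_0 \mathcal{L}_y \cdots \mathcal{L}_y y_i \lambda \mathcal{L}_y \cdots \mathcal{L}_y y_k,$$
$$x'' = x_0 \mathcal{L}_x \cdots \mathcal{L}_x \lambda x_i \mathcal{L}_x \cdots \mathcal{L}_x x_j,$$
$$y'' = y_0 \mathcal{L}_y \cdots \mathcal{L}_y \lambda y_i \mathcal{L}_y \cdots \mathcal{L}_y y_k.$$

Then $x' \prec y' \iff x \prec y \iff x'' \prec y''$.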

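Proof. First suppose that $x' \prec y'$, so that one of the conditions
(C1)-(C3) of Lemma 1 must hold:

• Assume that $\mathcal{L}_{x'} < \mathcal{L}_{y'}$. Then
$\lambda < \mathcal{L}_y$ and
$\mathcal{L}_x \le \mathcal{L}_{x'} < \mathcal{L}_{y'} = \mathcal{L}_y$,
so that $x \prec y$ by (C1).

• Assume that $\mathcal{L}_{x'} = \mathcal{L}_{y'}$, with
$\mathcal{C}_{x'} < \mathcal{C}_{y'}$. If $\lambda = \mathcal{L}_y$, then
either $\mathcal{L}_x < \mathcal{L}_y$ or $\lambda = \mathcal{L}_x$ and
$\mathcal{C}_x = \mathcal{C}_{x'} - 1 < \mathcal{C}_{y'} - 1 = \mathcal{C}_y$;
otherwise, $\lambda < \mathcal{L}_y$, so that
$\mathcal{L}_x = \mathcal{L}_y$ with
$\mathcal{C}_x = \mathcal{C}_{x'} < \mathcal{C}_{y'} = \mathcal{C}_y$. In
all three cases, $x \prec y$: by (C1) in the first case and by (C2) in the
other two.

• If $\mathcal{L}_{x'} = \mathcal{L}_{y'}$ and
$\mathcal{C}_{x'} = \mathcal{C}_{y'}$, then whether or not $x \prec y$
depends on the least value $h$ of Lemma 1 such that $x_h \ne y_h$:

  ◦ if $h \ne i$, then the ordering of $x, y$ corresponds to the ordering
  of $x', y'$, unaffected by removing $\lambda$;

  ◦ if $h = i$, then the ordering of $x, y$ reduces to the ordering of
  $x_h\lambda, y_h\lambda$, so that $x \prec y$ by Theorem 2.

Next suppose that $x \prec y$. Again we consider the conditions (C1)-(C3)
of Lemma 1:

• Assume that $\mathcal{L}_x < \mathcal{L}_y$. If
$\lambda = \mathcal{L}_y$, then
$\lambda = \mathcal{L}_{x'} = \mathcal{L}_{y'}$ with
$\mathcal{C}_{x'} = 1 < \mathcal{C}_{y'}$, so that $x' \prec y'$ by (C2);
while if $\lambda < \mathcal{L}_y$, then $x' \prec y'$ by (C1), because
$\mathcal{L}_x \le \mathcal{L}_{x'} < \mathcal{L}_y = \mathcal{L}_{y'}$.

• Assume that $\mathcal{L}_x = \mathcal{L}_y$, with
$\mathcal{C}_x < \mathcal{C}_y$. If
$\lambda = \mathcal{L}_x = \mathcal{L}_y$, then
$\mathcal{C}_{x'} = \mathcal{C}_x + 1 < \mathcal{C}_y + 1 = \mathcal{C}_{y'}$;
if $\lambda < \mathcal{L}_x = \mathcal{L}_y$, then
$\mathcal{C}_{x'} = \mathcal{C}_x < \mathcal{C}_y = \mathcal{C}_{y'}$. In
both cases, $x' \prec y'$ by (C2).

• If $\mathcal{L}_x = \mathcal{L}_y$ and $\mathcal{C}_x = \mathcal{C}_y$,
then again whether or not $x' \prec y'$ depends on the least value $h$ of
Lemma 1 such that $x_h \prec y_h$:

  ◦ if $h \ne i$, then the ordering of $x', y'$ corresponds to the ordering
  of $x, y$, unaffected by adding $\lambda$;

  ◦ if $h = i$, then the ordering of $x', y'$ reduces to the ordering of
  $x_h\lambda, y_h\lambda$, so that $x' \prec y'$ by Theorem 2.

This completes the proof that $x' \prec y' \iff x \prec y$. The proof that
$x'' \prec y'' \iff x \prec y$ is similar. □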

To see that Lemma 5 does not hold for
$\lambda > \max(\mathcal{L}_x, \mathcal{L}_y)$, consider
$x = 1323 \prec y = 3133$, $\lambda = 4$, but
$y' = 43133 \prec x' = 14323$.
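The counterexample can be verified step by step from Lemma 1, taking $i = 0$ so that $\lambda$ is inserted after $x_0 = 1$ and after $y_0 = \varepsilon$. The last line, with $\lambda = 2$, is an additional illustrative case (not from the source) showing that the order is preserved when $\lambda \le \max(\mathcal{L}_x, \mathcal{L}_y)$.

```latex
\begin{align*}
x = 1323 &= 1 \cdot 3 \cdot 2 \cdot 3 \cdot \varepsilon,
  & \mathcal{L}_x &= 3,\ \mathcal{C}_x = 2, \\
y = 3133 &= \varepsilon \cdot 3 \cdot 1 \cdot 3 \cdot \varepsilon \cdot 3 \cdot \varepsilon,
  & \mathcal{L}_y &= 3,\ \mathcal{C}_y = 3
  &&\Longrightarrow\ x \prec y \ \text{by (C2)}, \\
\lambda = 4:\quad x' = 14323 &= 1 \cdot 4 \cdot 323,
  & y' = 43133 &= \varepsilon \cdot 4 \cdot 3133,
  &&h = 0,\ \varepsilon \prec 1
  \ \Longrightarrow\ y' \prec x' \ \text{by (C3)}, \\
\lambda = 2:\quad x' = 12323 &,\ \mathcal{C}_{x'} = 2,
  & y' = 23133 &,\ \mathcal{C}_{y'} = 3
  &&\Longrightarrow\ x' \prec y' \ \text{by (C2)}.
\end{align*}
```

With $\lambda = 4$ the inserted letter becomes the new largest letter of both strings, so the comparison is decided by the substrings to its left rather than by the original multiplicities.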


Remark 1. Lemma 5 is easily generalized by replacing $\lambda$ by any
string $u = u_1 u_2 \cdots u_m$ such that, for $1 \le j \le m$,
$u_j \le \max(\mathcal{L}_x, \mathcal{L}_y)$, and inserting such a $u$ at
any or all positions $i \in \{0 \ldots \max(j, k)\}$.

Lemma 6. For any two strings $x, y$ and letters $\lambda, \mu \in \Sigma$,
$\lambda \le \mu$:

(i) $x \prec y \implies \lambda x \prec \mu y$;

(ii) $x \prec y \implies x\lambda \prec y\mu$.

Proof. For $\lambda = \mu$, (i) reduces to Lemma 4 while (ii) reduces to
Lemma 3. Thus we may assume $\lambda < \mu$.

Suppose $x \prec y$. Then by Lemma 4, $\lambda x \prec \lambda y$, while by
Theorem 2 with $u = \varepsilon$, $\lambda y \prec \mu y$. Therefore
$\lambda x \prec \mu y$, proving (i). The proof of (ii) is similar. □

The following examples show that the converse implications do not hold in
Lemma 6:

(i) $y = 441 \prec x = 442$, $\lambda = 2 < \mu = 3$, but
$\lambda x = 2442 \prec \mu y = 3441$;

(ii) $y = 441 \prec x = 442$, $\lambda = 2 < \mu = 3$, but
$x\lambda = 4422 \prec y\mu = 4413$.
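Each comparison in these two examples follows from the V-forms via Lemma 1, with largest letter 4 occurring twice throughout. The derivation below is a worked check under the notation of Definition 2, with dots separating the substrings from the occurrences of 4.

```latex
\begin{align*}
y = 441 &= \varepsilon \cdot 4 \cdot \varepsilon \cdot 4 \cdot 1,
  & x = 442 &= \varepsilon \cdot 4 \cdot \varepsilon \cdot 4 \cdot 2,
  & h = 2:\ 1 \prec 2 &\ \Longrightarrow\ y \prec x, \\
\lambda x = 2442 &= 2 \cdot 4 \cdot \varepsilon \cdot 4 \cdot 2,
  & \mu y = 3441 &= 3 \cdot 4 \cdot \varepsilon \cdot 4 \cdot 1,
  & h = 0:\ 2 \prec 3 &\ \Longrightarrow\ \lambda x \prec \mu y, \\
x\lambda = 4422 &= \varepsilon \cdot 4 \cdot \varepsilon \cdot 4 \cdot 22,
  & y\mu = 4413 &= \varepsilon \cdot 4 \cdot \varepsilon \cdot 4 \cdot 13,
  & h = 2:\ 22 \prec 13 &\ \Longrightarrow\ x\lambda \prec y\mu.
\end{align*}
```

The final comparison $22 \prec 13$ holds by (C1), since the largest letter of $22$ is smaller than that of $13$. In every case the conclusion of Lemma 6 holds although $x \nprec y$.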

4 Applications 

Some of the results presented above lead us to some interesting
applications. In this section, we first present a brief sketch of an idea
for a new string comparison algorithm in V-order and then proceed to
consider applications of our results to suffix arrays (SAs) and the
Burrows-Wheeler transform (BWT).

4.1 V-Order String Comparison

Recently, Alatabbi et al. presented an interesting V-order string
comparison algorithm in [?] (referred to as the ADRS algorithm henceforth),
where a mapping of the position of each letter in the string is exploited
to check for the conditions stated in Lemma 1. Note that there are three
conditions in Lemma 1 and things get most interesting when we reach
Condition (C3) because of its recursive nature. Now, the efficiency of the
ADRS algorithm depends on a key result (cf. Corollary 2.9 of [2]) which
proves that the mismatch position of the two strings under comparison
remains the same as we go deeper into the recursion. This fact along with
the result presented in Lemma 5 gives us yet another idea for an efficient
string comparison algorithm in V-order. Essentially, the idea builds upon
the map used in the ADRS algorithm, as we will now outline.
Suppose we are given two strings, $x$ and $y$, with V-forms

$$x = x_0 \mathcal{L}_x x_1 \mathcal{L}_x x_2 \cdots x_{j-1} \mathcal{L}_x x_j,$$

$$y = y_0 \mathcal{L}_y y_1 \mathcal{L}_y y_2 \cdots y_{k-1} \mathcal{L}_y y_k.$$

Step 1: We first scan the input strings from left to right to identify
$\mathcal{L}_x$ and $\mathcal{L}_y$ and compute $\mathcal{C}_x$ and
$\mathcal{C}_y$. At this point, if we can determine the order using
conditions (C1) and/or (C2) of Lemma 1 then we terminate immediately,
returning the order.

Step 2: We compute the first mismatch position, $h$, between $x$ and $y$;
that is, for $1 \le i < h$, we have $x[i] = y[i]$ and $x[h] \ne y[h]$. Now,
by applying Lemma 5 we can ignore the letters to its left, because they are
equal in $x$ and $y$. Note that the case when $h$ lies within $x_0$ ($y_0$)
can be handled easily.

Step 3: Assume that the nearest $\mathcal{L}_x = \mathcal{L}_y$ to the
right of $h$ is at position $\ell_x + 1$ ($\ell_y + 1$) in $x$ ($y$). The
case when $h$ lies within $x_j$ ($y_k$) again can be handled easily.

Step 4: Now we focus on $x' = x[h..\ell_x]$ and $y' = y[h..\ell_y]$.
Essentially, we will construct a map as is done in the ADRS algorithm. But
we will not construct the map completely; rather we will construct, in a
different way, only the part of the map that is relevant to the
computation. To do this we count the number of occurrences of each letter
$\alpha \in \Sigma$ within an appropriate range as follows. We start with
the highest letter and continue downward. Assuming that
$\sigma = |\Sigma|$, we use two $\sigma$-length arrays
$count_x[1..\sigma]$ and $count_y[1..\sigma]$ as follows. Suppose we are
counting the number of occurrences of $\alpha \in \Sigma$. Then we find the
leftmost occurrence $p$ of a letter $\beta > \alpha$ in the range
$x[h..\ell_x]$ such that there is no occurrence of $\gamma > \beta$ before
$p$. And we count the number of occurrences of $\alpha$ in the range
$x[h..p-1]$ and store it in $count_x[\alpha]$. Similarly we compute
$count_y[\alpha]$.

Step 5: At this point, in $count_x[1..\sigma]$ ($count_y[1..\sigma]$) we
have the frequency of each letter $\alpha \in \Sigma$ in the appropriate
range. Now the rest is quite easy. We scan $count_x$, $count_y$ from the
higher to lower letters of $\Sigma$ as follows:

    for α = highest(Σ) to lowest(Σ) do
        if countx[α] == county[α] then
            ▷ Either α does not occur (both counts are zero) or we are in
            ▷ Condition (C3), so we need to check the next letter.
            continue
        else
            ▷ countx[α] ≠ county[α]: either α does not occur in x (countx[α]
            ▷ is zero) or in y (county[α] is zero), or the two counts differ;
            ▷ that is, we are in Condition (C1) or (C2).
            if countx[α] < county[α] then
                return x ≺ y
            else
                return y ≺ x
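The early-exit idea behind Steps 1 and 2 can be illustrated with a minimal Python sketch. It is not the counting procedure of Steps 3 to 5: after the (C1)/(C2) test it simply strips the common prefix and suffix of the two strings, which Theorem 2 permits, and falls back on the recursion of Lemma 1 for the differing middle part. The names `v_less` and `v_less_fast` are hypothetical, and letters are assumed to be ordered by code point.

```python
def v_less(x, y):
    """Reference comparison: True iff x precedes y in V-order (Lemma 1)."""
    if x == y:
        return False
    if not x or not y:
        return not x                      # the empty string precedes all others
    gx, gy = max(x), max(y)
    if gx != gy:                          # (C1)
        return gx < gy
    px, py = x.split(gx), y.split(gy)     # V-form substrings x_0..x_j, y_0..y_k
    if len(px) != len(py):                # (C2)
        return len(px) < len(py)
    a, b = next((a, b) for a, b in zip(px, py) if a != b)
    return v_less(a, b)                   # (C3)


def v_less_fast(x, y):
    """Exit early on (C1)/(C2); otherwise compare only the differing middle."""
    if x == y:
        return False
    if not x or not y:
        return not x
    gx, gy = max(x), max(y)               # Step 1: largest letters ...
    if gx != gy:
        return gx < gy
    cx, cy = x.count(gx), y.count(gy)     # ... and their multiplicities
    if cx != cy:
        return cx < cy
    shorter = min(len(x), len(y))
    lo = 0                                # Step 2: first mismatch position
    while lo < shorter and x[lo] == y[lo]:
        lo += 1
    hi = 0                                # common suffix, not overlapping the prefix
    while hi < shorter - lo and x[-1 - hi] == y[-1 - hi]:
        hi += 1
    # By Theorem 2, removing the shared context does not change the order.
    return v_less(x[lo:len(x) - hi], y[lo:len(y) - hi])


print(v_less_fast("strop", "strophe"))
print(v_less_fast("catastrophe", "strophe"))
```

For inputs that differ in their largest letter or its multiplicity the function returns after two scans, and for inputs with long shared prefixes or suffixes the recursive part only sees the short differing region, which is the input-sensitive behaviour the sketch is meant to convey.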

At this point a brief discussion is in order. Recall that the ADRS
algorithm runs in $O(n + \sigma)$ time. Because $\sigma$ is $O(n)$, this
running time is optimal. Therefore, we cannot obtain an asymptotic
improvement, and the theoretical time complexity of the new algorithm
matches that of the ADRS algorithm. However, the use of Lemma 5 gives us an
opportunity to do much less work in practice, especially for favourable
input strings. This is why, despite the same theoretical time complexity,
our new algorithm is input-sensitive and in practice should perform better
than the ADRS algorithm.

4.2 Suffix sorting and Burrows Wheeler transformation 

The suffix permutation [?] of a word $w = w_1 w_2 \cdots w_n$ is the
permutation $\pi_w$ over $\{1, \ldots, n\}$, where $\pi_w(i)$ is the rank
of the suffix $w[i, n]$ in the set of the lexicographically sorted suffixes
of $w$. In [?] it is shown how to deduce the Lyndon factorization
(Theorem 1) of a text from its suffix permutation; conversely, a strategy
is given in [15] for obtaining the suffix array from the Lyndon
factorization of a text.

We will outline how our new results from Section 3 can be applied to
obtaining a lex-extension suffix array from the V-order factorization of a
text. The distinctness of factors in a Lyndon versus V-order factorization
of a given string [?] opens more avenues for string processing (such as
choosing the factorization with more or fewer factors for efficiency).

To elaborate, there are three main cases to be handled for the
V-factorization algorithm VF in [?] as follows. To determine the V-order
factorization $x_1 \ge x_2 \ge \cdots \ge x_k$ of a string $x$, algorithm
VF applies Lemma 3.16 in [6] to substrings $x_i, x_j$:

• If (C1) holds for $x_i, x_j$ ($\mathcal{L}_{x_i} < \mathcal{L}_{x_j}$)
then $x_i \ge x_j$ in the factorization - the algorithm tracks maximal
elements.

• If (C2) holds for $x_i, x_j$ then $x_i < x_j$ if $x_i x_j$ is a Hybrid
Lyndon (that is, a Lyndon word under lex-extension [?]), and $x_i x_j$ is a
factor in the factorization - the algorithm checks for concatenating
repetitions.

• If (C3) holds for $x_i, x_j$, and if $x_i \prec x_j$, then $x_i x_j$ is a
factor in the factorization - the algorithm compares substrings between
maximal elements.

As each factor is identified by algorithm VF, its rightmost position is
recorded (procedure OUTPUT) and then all housekeeping variables are
re-initialized (procedure RESET) - this essentially converts the remaining
suffix of the string into a new string to be factored, with no re-visiting
of the previously factored elements required. Hence, similarly to Duval's
Lyndon decomposition algorithm [?], the linear V-order factoring technique
can be used for on-line scenarios, which is the setting of our
applications.

Now, we are interested in the notion of compatibility for sorting suffixes
as introduced in [15]. Let $x$ be a word and $u$ be a substring (factor) of
$x$. The sorting of suffixes $s_1, s_2$ of $u$, with respect to $u$, is
compatible with the sorting of the suffixes of $x$ for which $s_1, s_2$ are
prefixes, with respect to $x$, if they have the same order in both $u$ and
$x$. It is shown in [15] that, although compatibility doesn't always hold
for lexorder suffix-sorting, when $u$ is chosen to be a substring of Lyndon
factors in a factorization then it does hold. In contrast, compatibility
always holds for sorting suffixes in V-order, and furthermore, the shorter
suffix is always lesser:

Lemma 7. Let $x \in \Sigma^+$ and $u$ be a substring of $x$ with $s_1$ a
suffix of $u$. If $s_2$ is a proper suffix of $s_1$ then $s_2 \prec s_1$
with respect to both $u$ and $x$.

Proof. Consider the suffixes $s_1 t_1$ and $s_2 t_2$ of $x$ for possibly
empty $t_1, t_2$. Applying Lemma 2, both $s_2 \prec s_1$ with respect to
$u$ and $s_2 t_2 \prec s_1 t_1$ with respect to $x$. □

Lemma 2 further shows that suffixes are totally V-ordered by their given
order: for any string $x = x[1..n]$, we have
$x_n \prec x_{n-1} x_n \prec \cdots \prec x$.

However, to address applications involving conjugates of strings, such as
the Burrows-Wheeler transform, Lemma 7 doesn't suffice for V-order. When
suffixes are used to sort all rotations of a string, each rotation has the
same number of maximal elements, so condition (C3) implicitly applies: for
ordering these suffixes we need the first distinct prefix substrings of the
V-forms of the suffixes. We will use lex-extension ordering, which compares
factors in a factorization pair-wise from left to right while each
comparison is made in V-order.

Theorem 3. Let $x \in \Sigma^+$ with V-order factorization
$x = x_1 \cdots x_k$, and let $u = x_i \cdots x_j$, for
$1 \le i \le j \le k$. Then the sorting of the suffixes of $u$ is
compatible with the sorting of the suffixes of $x$.

Proof. The case of the Lyndon factorization is Theorem 3.2 in [15]. The
V-order proof thus follows from the Lyndon-like properties of the V-order
factorization and by replacing lexorder with lex-extension ordering. □

Equipped with this theorem, the clever incremental suffix sorting & BWT
strategy introduced in [15] can be modified for V-order:

Step 1: Compute the V-order factorization of $x = v_1 \cdots v_k$ in linear
time [?].

Step 2: Compute the lex-extension order suffix array of each of $v_1$ and
$v_2$ in linear time [9].

Step 3: Obtain the BWT($v_j$) from each SA($v_j$): for a suffix
$v_i = x[h..m]$ the BWT character is $x[h-1]$.

Step 4: Merge the sorted suffixes in Step 2 using the ADRS algorithm [?] to
obtain the suffix array of $v_1 v_2$. For the merge, if $v_j \succ v_k$,
then the chosen suffix for the new array is $v_k$; otherwise it is
$v_j v_k$.

Step 5: Obtain the BWT of the merged sorted suffixes in Step 4. If the
chosen suffix for the new array was $v_k$, then the BWT character is given
by BWT($v_k$); otherwise it is BWT($v_j$), since the prefix $x[1..h-1]$ in
$x$ is rotated as $v_j v_k \cdots x[1..h-1]$.

Step 6: Compute the lex-extension order suffix array of $v_3$ and merge it
with the suffix array of $v_1 v_2$ from Step 4 and obtain the BWT.

Step 7: Repeat until all the V-factors have been incrementally processed.

Overall, for iterating over $k$ factors, the time complexity is
$O(k^2 n)$, with each iteration taking $O(kn)$. As expressed in [15] for
the Lyndon case, this technique is suitable for integration with the
on-line V-order factoring algorithm: suffix sorting can proceed in tandem
as soon as the first V-factor is identified. Note that in Step 4 above, the
new string comparison algorithm presented in Section 4.1 can be applied
when input-sensitivity is relevant.

5 Future Research 

We propose the following problem: Suppose that $x, y \in \Sigma^+$ with
$x \prec y$. Under what permutations $\pi$, that is, $x \to \pi(x)$ and
$y \to \pi(y)$, does $\pi(x) \prec \pi(y)$ hold? For instance, for
integers, $21 \prec 12$ and no permutation works; whereas interchanging the
first and last letters does for $142 \prec 243$ since $241 \prec 342$,
which generalizes to requiring that the rightmost substrings of their
V-forms are in V-order.

We propose studying such permutations in the context of the gene team
problem: to find a set of genes that appear in two or more species,
possibly in a different order, but within a given distance in each
chromosome - this has impact in understanding genome evolution and
function [16].